Why Some Models Resist Unlearning: A Linear Stability Perspective
Machine unlearning, the ability to erase the effect of specific training samples without retraining from scratch, is critical for privacy, regulation, and efficiency. However, most progress in unlearning has been empirical, with little theoretical understanding of when and why unlearning works. We tackle this gap by framing unlearning through the lens of asymptotic linear stability, capturing the interaction between optimization dynamics and data geometry. The key quantity in our analysis is data coherence: the cross-sample alignment of loss-surface directions near the optimum. We decompose coherence along three axes (within the retain set, within the forget set, and between them) and prove tight stability thresholds that separate convergence from divergence. To further link data properties to forgettability, we study a two-layer ReLU CNN under a signal-plus-noise model and show that stronger memorization makes forgetting easier: when the signal-to-noise ratio (SNR) is lower, cross-sample alignment is weaker, which reduces coherence and makes unlearning easier; conversely, high-SNR, highly aligned models resist unlearning. For empirical verification, we show that Hessian tests and CNN heatmaps align closely with the predicted boundary, mapping the stability frontier of gradient-based unlearning as a function of batching, mixing, and data/model alignment. Our analysis is grounded in random-matrix-theory tools and provides the first principled account of the trade-offs between memorization, coherence, and unlearning.
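To make the coherence quantity concrete, here is a minimal Python sketch, under the assumption that coherence is measured as the mean cosine alignment of per-sample loss gradients near the optimum; the helper names per_sample_grads and coherence are hypothetical illustrations, not the paper's implementation.

import torch
import torch.nn.functional as F

def per_sample_grads(model, loss_fn, xs, ys):
    # Flattened loss gradient for each sample, stacked as rows of a matrix.
    params = [p for p in model.parameters() if p.requires_grad]
    rows = []
    for x, y in zip(xs, ys):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        rows.append(torch.cat([g.reshape(-1) for g in grads]))
    return torch.stack(rows)

def coherence(G_a, G_b):
    # Mean pairwise cosine alignment between two sets of per-sample gradients.
    # (For a within-set measurement, the diagonal self-terms are included here
    # for simplicity.)
    A = F.normalize(G_a, dim=1)
    B = F.normalize(G_b, dim=1)
    return (A @ B.T).mean().item()

# The three axes above: within-retain, within-forget, and cross coherence.
# G_r = per_sample_grads(model, loss_fn, x_retain, y_retain)
# G_f = per_sample_grads(model, loss_fn, x_forget, y_forget)
# rho_rr, rho_ff, rho_rf = coherence(G_r, G_r), coherence(G_f, G_f), coherence(G_r, G_f)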
A Experimental Setup
We provide the detailed experimental setup in this section. We empirically show that even a simple linear network can enter the EOS regime.

Fully-connected Network. We conduct further experiments on several fully-connected networks with 4 hidden layers and various activation functions. For example, the structure of a fully-connected tanh network is shown in Table 2.

Table 2: Fully-connected network

Layer #   Name   Layer                               In shape      Out shape
1                Flatten()                           (3, 32, 32)   3072
2         fc1    nn.Linear(3072, 200, bias=False)    3072          200
3                nn.Tanh()                           200           200

As in the fully-connected network experiments, we consider tanh, ReLU, and ELU activations.
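For concreteness, a minimal PyTorch sketch of such a network is given below. Only the first layers are specified in Table 2; the remaining hidden widths, the bias-free output layer, and the 10-way output head for (3, 32, 32) inputs are assumptions made for illustration.

import torch.nn as nn

def make_fc_net(activation=nn.Tanh, width=200, num_classes=10):
    # 4-hidden-layer fully-connected network; pass nn.Tanh, nn.ReLU, or nn.ELU.
    layers = [nn.Flatten(), nn.Linear(3 * 32 * 32, width, bias=False), activation()]
    for _ in range(3):  # three further hidden layers, four in total
        layers += [nn.Linear(width, width, bias=False), activation()]
    layers.append(nn.Linear(width, num_classes, bias=False))  # assumed output head
    return nn.Sequential(*layers)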
On Theoretical Interpretations of Concept-Based In-Context Learning
Tang, Huaze; Peng, Tianren; Huang, Shao-lun
In-Context Learning (ICL) has emerged as an important new paradigm in natural language processing and large language model (LLM) applications. However, the theoretical understanding of the ICL mechanism remains limited. This paper investigates this issue by studying a particular ICL approach, called concept-based ICL (CB-ICL). In particular, we propose theoretical analyses of applying CB-ICL to ICL tasks, which explain why and when CB-ICL performs well at predicting query labels in prompts with only a few demonstrations. In addition, the proposed theory quantifies the knowledge that LLMs can leverage for the prompt tasks, and leads to a similarity measure between the prompt demonstrations and the query input, which provides important insights and guidance for model pre-training and prompt engineering in ICL. Moreover, the impact of the prompt demonstration size and of the dimension of the LLM embeddings in ICL is also explored based on the proposed theory. Finally, several real-data experiments are conducted to validate the practical usefulness of CB-ICL and the corresponding theory. With the great success of large language models (LLMs), in-context learning (ICL) has emerged as a new paradigm for natural language processing (NLP) (Brown et al., 2020; Chowdhery et al., 2023; Achiam et al., 2023), where LLMs address the requested queries in context prompts with a few demonstrations.
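As a concrete illustration of such a similarity measure, one simple instance is the mean cosine similarity between the LLM embeddings of the demonstrations and of the query; the sketch below is a hypothetical example, not the exact measure derived in the paper.

import numpy as np

def demo_query_similarity(demo_embs, query_emb):
    # demo_embs: (n_demos, d) embedding matrix; query_emb: (d,) embedding vector.
    D = demo_embs / np.linalg.norm(demo_embs, axis=1, keepdims=True)
    q = query_emb / np.linalg.norm(query_emb)
    return float((D @ q).mean())  # mean cosine similarity

# The per-demonstration scores D @ q could also rank candidate demonstrations,
# e.g. to select the most query-relevant ones when building a prompt.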